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We characterize the statistical properties of a large number of online auctions run on 
eBay. Both stationary and dynamic properties, like distributions of prices, number of bids 
etc., as well as relations between these quantities are studied. The analysis of the data 
reveals surprisingly simple distributions and relations, typically of power-law form. Based 
on these findings we introduce a simple method to identify suspicious auctions that could 
be influenced by a form of fraud known as shill bidding. Furthermore the influence of 
bidding strategics is discussed. The results indicate that the observed behavior is related 
to a mixture of agents using a variety of strategies. 
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1. Introduction 

The Internet has triggered new forms of commerce. Not only has it become possible 
to order almost every product comfortably from your home, but also different forms 
of trading have become popular. Here especially the success of online auction sites 
is remarkable. Through the Internet participation in auctions is no longer just for 
a minority, but allows millions of consumers all over the world to buy and sell 
almost any kind of product. Nowadays, auction sites rank very high in the number 
of visitors, and also in the average time spent per visit. 

Understanding the interactions between users is not only interesting fro m an 
economic point of view. Surprisingly, despite the success of econophysic J^^ ( only 
very few studies of the statistical properties of online auctions by physicists exisfl"™. 
Ref. 01 focused on properties of agents (i.e. bidders and sellers) in auctions, e.g. the 
number of agents participating or the frequency of bids placed. It was found that on- 
line auctions can be considered as complex systems that exhibit emergent behaviour. 
In Ref. 0] the relation between different agents was investigated by analyzing the 
structure of the interaction network resulting from online auctions. 

Online auctions are conducted in a different way than the standard English 
auction which usually ends when nobody is willing to bid higher ("soft close"). In 
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contrast, online auctions end at a fixed time known to all agents ("hard close"). 
Any agent offering an item at an online auction house like eBay first has to specify 
the starting time, starting price and duration of the auction. Each new bid placed 
by another agent has to exceed the currently listed price by a preset increment. 
Agents can either bid incrementaly by placing a bid that corresponds to the current 
price plus the minimal acceptable increment, or take advantage of proxy bidding. 
In proxy bidding an agent indicates the maximum price he/she is willing to pay 
for the given item (proxy price), which is not disclosed to other bidders. Each time 
a bidder places a new bid, the auction house makes automatic bids for the agent 
with an active proxy bid a , outbidding the last bid with a fixed increment until the 
proxy price is reached. The agent with the highest bid wins. 

Due to the proxy system eBay auctions are very similar to second price 
auctioniP . The winner (buyer) does not pay a price corresponding to his actual 
winning bid. Instead the final price is determined by the second highest bid, plus 
the preset increment. 

From a physics point of view the development of a simple microscopic model for 
online auctions appears to be quite challenging. In contrast to most other processes 
it is essential that the dynamics ends at a certain time and that this fact is known to 
all the agents. As we will show below, this is clearly reflected in the empirical data. 
We will not deal with the problem of modelling here, but instead try to determine 
the generic statistical properties of online auctions empirically. These data might 
then be used in order to test the validity of model approaches. 

eBay keeps a detailed record of the bidding history that is publicly available. 
Bidding agents are distinguished by a unique user name. This allows to study the 
dynamics of the bidding process in a quantitative way. In this paper, we analyze 
and characterize the statistical properties of online auction data. We focus on the 
properties of the auctions and especially the relations between various quantities 
(price, number of bids placed etc.). Furthermore we also study the dynamics of the 
bidding process, not only the stationary properties after the end of the auction. 
This allows interesting insights into the strategies used by the agents participating 
in the auction. 

2. Data collection 

For our investigation we have used two major sets of auction data from eBay Ger- 
many (www.ebay.de) that allow us to focus on various properties in more detail. 
Many of our results are found to be valid for both data sets. 

The first set (DB-1) comprises data collected from auctions running on March 
22, 2004. We focus on auctions with the label "OVP" b in the title, indicating a 
new product. The data comprises 173,315 auctioned items, grouped in 9904 subcat- 

a Note that at most one active proxy bid can exist at any time. 

b Abbreviation for 'Originalverpackung', meaning 'original wrapping' or 'sealed'. 
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egories by eBay. 262,508 distinct agents bidding on items and 43,500 sellers offering 
auctioned items are identified. 

The second set (DB-2) are data collected over 10 months in the subcategories 
"web projects" and "websites & domains". The auctions involve 11,145 agents bid- 
ding on 52,373 items. 

The data set DB-1 allows to study auctions where the majority of products has 
a well-defined value (e.g. market price) for all agents. In contrast, the set DB-2 
comprises items where the value can be different for different agents. 



3. Statistical analysis 

Using these data one finds that the distribution of the total number of bids placed 
by a given agent follows a power-law for both data sets. The total number of distinct 
items offered by a given agent (as seller) also follows a power-law distribution. 

However, first we have examined distributions of static properties (i.e. properties 
at the end of an auction) . A good qualitative agreement with the results of Ref . 3 
is observed. E.g. the distributions of the distinct number of agents n age nt simulta- 
neously bidding on a certain item and the total number of bids nuds received for 
an item both follow exponential distributions P(n) ~ exp(— n/no), where no = 6.5 
for nbids and no = 2.9 for n agcnt . This is in agreement with Ref. 3, where the values 
no = 5.6 for nbids and no = 2.5 for n agcnt for eBay and no = 10.8 for nbids and 
no = 7.4 for n agent for eBay Korea were obtained. 

The activity of individual agents as bidder or seller follows power-law 
distributional One can find that the distribution of the total number of bids placed 
by the same agent, denoted by nbids, follows a power law 

^Kids) ~ "bids' (!) 

where 7 = 1.9. 

In order to quantify the bidding process we define a dimensionless variable that 
we call return g. It is the relative increase of the submitted bid b: 

b Pcurrent /~\ 

Q= , (2) 

Pcurrent 



where p curre nt is the current or listed price just before the bid is placed. In analogy 
to the quantity used in financial market J ' ' it measures relative changes. Very 
large values of the return are usually generated by the first submitted bid, if the 
bidder follows the recommendations of eBay and submits the maximum price (s)he 
is willing to pay. The distribution of g is found to follow a power law for almost 
three orders of magnitude with exponent —2.44 (Fig.HJ. The form of the distribution 
appears to be quite stable even for different time spans of the bid submission^. 

Although the bids are correlated, the returns show only very short-ranged corre- 
lations. By computing the correlation function cy = {gigj) — (gi) (gj) of the returns 
(indices i denote the chronological order of arriving bids; averaging is done over 
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Fig. 1. The cumulative probability distribution of returns q follows a power law with exponent 
-1.44. 



all auctions) one finds that off-diagonal terms Cy with i ^ j are very smalM This 
result is rather surprising and we have currently no simple explanation. 

One of the interesting questions is the functional dependenc e of th e price on 
other dynamic parameters such as number of bids. Several studieJ^^ investigate 
the influence of static parameters like the ending time (which day of the week, 
on which daytime), start price, reputation of the seller etc. Much less is known 
(quantitatively) about relations between dynamic parameters. 

We have determined the dependence of the price on other dynamic parameters. 
For the relation between the final price p nna i and number of bids nbid placed on the 
item a power law is found: 

(Pfinaj) oc ng id , (3) 

where a — 1.58 for the data set DB-2 and a — 1.53 for the data set DB-1 (Fig. EJ. 

This functional dependence seems to be universal and independent of the sub- 
categories or time intervals used for data collection™ Especially there seems to be 
no significant difference between the two data sets DB-1 and DB-2. 

Empirically we find that the distribution of the final price for fixed total number 
of bids is well described by a log-normal distribution®. A possible explanation uses 
the observation that the returns pi are almost uncorrelated. Assuming b = P n in 
PI , the price P n after the n-th bid is given by a multiplicative process of the form 
Pn = [Qn + l)-Pn-i- Since the returns g n are found to be uncorrelated, ln(P n ) = 
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Fig. 2. Relation between final price (pfl na i) (in Euro, for starting price 1 Euro) and number of 
bids n bid for different days based on subsets of DB-1. The straight line is a power-law fit with 
exponent a = 1.53. 



Y^i=i in (ft + 1) converges to a Gaussian distribution for large n. 

In the following we discuss a possible origin of systematic deviations from the 
average behaviour described above, namely fraud. 



3.1. Skill bidding 

It is not uncommon that certain agents try to manipulate the sell price of the items 
they offer. Therefore we distinguish two different types of bidding behaviour: 1) 
All bidders try to keep the price as low as possible; 2) At least one bidder tries 
to push the price higher. The second type of behaviour is known as shill bidding. 
Shill bidding (also known as "bid padding") is the deliberate use of secondary 
registrations, aliases, family members, friends, etc. to artificially drive up the price 
of an item. This is strictly forbidden by eBajU^, but nevertheless happens quite 
frequently. Usually this sort of manipulation can be identified only after the auction 
has ended because the whole purpose of shill bidding is increasing the price without 
winning in the end! Shill bidding occurs mostly in auctions of products that do not 
have a well-defined value, e.g. market price. 

We have tried to identify this sort of manipulation through the statistical prop- 
erties. Indeed, successful shill bidding leads to auctions which show clear devia- 
tions from the observed simple statistical laws. To quantify this we have defined 
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a confidence interval based on the assumption that the price follows a log-normal 
distribution (see inset in Fig. |3J). The variance a 2 of the Gaussian distribution of 
the logarithm of prices with a fixed number of bids characterizes the confidence 
interval which is indicated in Fig. [21 by the two straight lines with slopes 1.32 and 
1.92, respectively. 

We have performed two tests by a) inve stig ating: the statistical properties of 
auctions identified e.g. from discussion forums'^ as shill bidding, and b) by checking 
whether randomly chosen auctions outside the confidence interval of the price-bid 
relationship show indications of shill bidding. Both tests require an extensive amount 
of work, e.g. the investigation of the tra ding history of the seller over a long time 
or monitoring eBay discussion forumJ^. Fig. shows a comparison between these 
shill-auctions and the average behaviour of all auctions (DB-1 and DB-2), clearly 
indicating systematic deviations. 

For test a) we have chosen 9 auctions that clearly have been identified as ma- 
nipulated by shill bidding, e.g. through information from discussion forumM Only 
after that we have investigated the bidding history of these auctions in more detail. 
Fig. shows that all of those, except for one, are clearly outside the confidence 
interval. For test b) 10 auctions outside the confidence interval have been chosen 
randomly. These have been checked thoroughly for indications of shill bidding. This 
also required investigating other auctions by the same seller etc. In this way we 
have found a clear indication for shill bidding in 7 of the 10 auctions. 

The results of the tests indicate that it would be sufficient to check the suspicious 
auctions (i.e. those outside of the confidence interval) in more detail. This would 
reduce the complexity of detecting manipulations drastically. Furthermore these 
deviations usually can be observed even before the end of the auction. Monitoring 
the current price as function of the number of submitted bids up to the time of 
observation, one can check whether this price is inside or outside of the confidence 
interval in Fig. [21 In the latter case, a high current price is an indication that bidders 
should be cautious (although it is no proof of shill bidding). 



3.2. Bid submission times 

Second price auctions were originally introduced by Vickrej0 to make everyones 
bidding strategy independent of strategies used by the other bidders. It could be 
shown that late bidding or bidding with a price less than one is willin g to pay 
are not optimal strategies. Using game theory based 

model JlH|1 7|18|1 9| 

it seems 

hard to understand why bidders bid more than once. We analysed the collected 
data and found that bidders prefer to bid close to auction ending times. Fig. 01 
shows the cumulative distribution of bid submission times as a function of the time 
remaining until the end of the auction. Two regimes with exponential behaviour can 
be observed related to the most common auction lengths of 7 and 10 days. Both parts 
are well described by P(At) ~ exp(-At/T ) with T = 68.94 h. Close to the end 
of the auctions so-called sniping leads to an algebraic distribution P(At) ~ (At) -7 
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Number of Bids 

Fig. 3. Distribution of sell prices as function of the number of bids placed. Only auctions with 
a starting price of 1 Euro are considered. Gray dots correspond to the average sell price for 
fixed number of bids. The broken lines specify the confidence interval (one standard deviation) 
determined from a log-normal fit (see inset). This is shown for nt,id = 20 in the inset. For other 
values of very similar results are obtained. Black dots indicate auctions identified as shill 

bidding using the criteria of Refs. 12-14 (test a). Triangles denote auctions that have been tested 
for possible shill bidding (test b). For black triangles strong indications for shill bidding have 
been found, whereas auctions corresponding to white triangles are unsuspicious. Light-gray and 
dark-gray colours denote regimes with high or low probability of shill bidding, respectively. 

with 7 = 1.1. Sniping is a special bidding strategy where the agent tries to submit 
a winning bid just before an auction closes in order to prevent other bidders from 
outbidding the sniper or driving the pri ce h igher. In fact, game-theoretic arguments 
show that sniping can be advantageou JH e.g. to avoid bid wars or auction fraud 
due to shill bidding. 

Surprisingly, the distribution of bid submissions times is qualitatively very sim- 
ilar both data sets DB-1 and DB-2. Although the scales in the exponential part 
differ slightly (T = 68.94 h for DB-2 and T Q = 90.09 h for DB-1), the exponents 
of the algebraic part are almost identical. From earlier investigation^^ one could 
have expected a qualitative difference between the two data sets. 

4. Discussion 

eBay online auctions can be considered as strongly correlated processes. Correlations 
are due to influences of standard market prices, supply and demand mechanisms 
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Fig. 4. Cumulative distribution of bid submission times as a function of the time remaining until 
the end of the auction (for DB-2). Two regimes with exponential behaviour can be observed related 
to the most common auction lengths of 7 and 10 days (i.e. 168 and 240 hours). Both parts are well 
described by P(At) ~ exp(— At/To) with To = 68.94 h. Close to the end of the auctions sniping 
leads to an algebraic distribution P(At) ~ (Ai) ' with 7 = 1.1. The data set DB-1 shows a very 
similar behaviour with Tq = 90.09 h and 7 = 1.15 (Ref. 6). 



within eBa y auctions and seller's feedback information available for any visitor of 
the site 15 . We have shown several emerging properties, which could be seen as 
an evidence of complexity. The generic statistic behavior of agents leads to a very 
simple relation between price and number of bids. As an application we have found 
that a kind of fraud known as shill bidding leads to significant deviations from the 
average behaviour. 

Econom ic theor ies have mainly used game theoretic methods to describe 
auctionJ^fl^»l These are based on the assumption of rational agents using 
specific bidding strategies, like evaluation (placing one's true value, usually early in 
the auction), unmasking (bidding as long as someone else is the highest bidder, e.g. 
to determine the highest bid), or incremental bidding (placing minimum acceptable 
bids). Our results indicate that eBay can be regarded as a complex system con- 
sisting of a mixture of agents using different strategies. In future work we intend 
to determine the influence of the rationality of the agents (in the sense of game 
theory) . 

As mentioned in the introduction, to our knowledge so far no microscopic 
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stochastic model for the description of online auctions exists. Most approaches use 
game theoretical methods which focus on the effects of different strategies. Our 
empirical data indicate that a model needs to take into account that a mixture 
of agents with different strategies exists which is responsible for the observed be- 
haviour. This can be seen most clearly in the distribution of bid submission times. In 
order to take into account the hard close of online auctions a naive approach would 
use dynamical rules which are itself time-dependent, which would be a challenge for 
theoretical studies of the model. 
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